Grapheme-Based Automatic Speech Recognition Using KL-HMM

نویسندگان

  • Mathew Magimai-Doss
  • Ramya Rasipuram
  • Guillermo Aradilla
  • Hervé Bourlard
چکیده

The state-of-the-art automatic speech recognition (ASR) systems typically use phonemes as subword units. In this work, we present a novel grapheme-based ASR system that jointly models phoneme and grapheme information using Kullback-Leibler divergence-based HMM system (KL-HMM). More specifically, the underlying subword unit models are grapheme units and the phonetic information is captured through phoneme posterior probabilities (referred as posterior features) estimated using a multilayer perceptron (MLP). We investigate the proposed approach for ASR on English language, where the correspondence between phoneme and grapheme is weak. In particular, we investigate the effect of contextual modeling on grapheme-based KL-HMM system and the use of MLP trained on auxiliary data. Experiments on DARPA Resource Management corpus have shown that the grapheme-based ASR system modeling longer subword unit context can achieve same performance as phoneme-based ASR system, irrespective of the data on which MLP is trained.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving grapheme-based ASR by probabilistic lexical modeling approach

There is growing interest in using graphemes as subword units, especially in the context of the rapid development of hidden Markov model (HMM) based automatic speech recognition (ASR) system, as it eliminates the need to build a phoneme pronunciation lexicon. However, directly modeling the relationship between acoustic feature observations and grapheme states may not be always trivial. It usual...

متن کامل

On recognition of non-native speech using probabilistic lexical model

Despite various advances in automatic speech recognition (ASR) technology, recognition of speech uttered by non-native speakers is still a challenging problem. In this paper, we investigate the role of different factors such as type of lexical model and choice of acoustic units in recognition of speech uttered by non-native speakers. More precisely, we investigate the influence of the probabili...

متن کامل

Decision Tree Clustering for Kl-hmm

Recent Automatic Speech Recognition (ASR) studies have shown that Kullback-Leibler diverge based hidden Markov models (KL-HMMs) are very powerful when only small amounts of training data are available. However, since the KL-HMMs use a cost function that is based on the Kullback-Leibler divergence (instead of maximum likelihood), standard ASR algorithms such as the commonly used decision tree cl...

متن کامل

Using out-of-language data to improve an under-resourced speech recognizer

Under-resourced speech recognizers may benefit from data in languages other than the target language. In this paper, we report how to boost the performance of an Afrikaans automatic speech recognition system by using already available Dutch data. We successfully exploit available multilingual resources through (1) posterior features, estimated by multilayer perceptrons (MLP) and (2) subspace Ga...

متن کامل

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011